Abstract

We cloned and sequenced the rDNA internal transcribed spacer 2 (ITS2) of 4 species belonging to the neotropical Anopheles
(Nyssorhynchus) albitarsis complex, that is, A. albitarsis; A. albitarsis B; Anopheles marajoara, a proven malaria vector; and Anopheles
deaneorum, a suspected vector. Even though the ITS2 sequences of these species were very similar (<1.17% divergence),
we found differences suitable for species identification and intragenomic variation of possible consequence in phylogenetic
reconstruction. Variation came from 2 microsatellite regions and a number of indels and base substitutions. The existence
of partially correlated subsets of clones in A. albitarsis is hypothesized to reflect either separate rDNA loci or
semi-independently evolving portions of a single rDNA locus. No differences were found between males and females, suggesting
that similar rDNA arrays exist on both the X and Y chromosomes. In addition, highly variant clones, possibly pseudogenes,
were found in A. marajoara from Venezuela.


Concerted evolution is the process by which all members of a multicopy gene family are converted to the same sequence.
The mechanism of concerted evolution has been attributed to either unequal crossing over or gene conversion (Smith
1976; Zimmer et al. 1980; Dover 1982). The rDNA is a multicopy gene family that exists as one or more tandem arrays of
many transcriptional units per cell (Gerbi 1985), where concerted evolution rapidly spreads mutations to all members of
the gene family, even if arrays are located on different chromosomes (Dover 1982; Gerbi 1985; Tautz et al. 1988). In
mosquitoes, each rDNA transcriptional unit is composed of an external transcribed spacer, an 18S subunit, an internal
transcribed spacer 1 (ITS1), a 5.8S subunit, an ITS2, and a 28S subunit. The rDNA units within an array are linked
to each other by an intergenic spacer (IGS). The transcribed spacers are thought to contain conserved structures
important in forming the mature ribosomal amplicon (Gerbi 1985; Thweatt and Lee 1990; Wesson et al. 1992; Paskewitz et al.
1993; van Nues et al. 1995). The rDNA sequence is a valuable source of information because the functional regions that
produce the ribosomes are highly conserved but the transcribed and nontranscribed spacers have high interspecific
and low intraspecific variability, making them useful for explaining relationships of recently diverged species and also
useful as a basis for polymerase chain reaction (PCR) identification of morphologically similar species. As such, ITS1
and ITS2 have been used extensively in phylogenetic reconstruction of closely related and cryptic species complexes, as
well as in the development of diagnostic species-specific PCR-based markers. However, because PCR can amplify
all sequences of ITS present within the genome, variation among ITS sequences within individuals or species could
result in inaccurate phylogenies and erroneous markers for species diagnostics. Consequently, identifying and quantifying
levels of intragenomic and intraspecific variation among ITS sequences are of real importance.

The mosquito genus Anopheles (443 formally named species) contains all the vectors of human malaria parasites.
Because many of the primary vectors belong to cryptic species complexes, it is necessary to have accurate phylogenetic
reconstructions and species diagnostics for the study of malaria transmission and its relation to Anopheles evolution.
Sequences of ITS1 and ITS2 are an excellent source for such information. In Anopheles, however, there are examples of
rDNA intragenomic variation (Wilkerson et al. 2004; Fairley et al. 2005), and its prevalence and magnitude are not well
studied. One possible explanation for intragenomic rDNA sequence variation in Anopheles is that rDNA arrays are linked
to different sex chromosomes; that is, they have been found only on the X of some species (Collins et al. 1989) or on
both the X and Y of others (Marchi and Pili 1994). Consequently, rDNA arrays on sex chromosomes that exhibit limited
recombination could be incompletely homogenized.

In this study, we examine ITS sequences from multiple individuals of 4 closely related species of the neotropical
Anopheles albitarsis complex. These include Anopheles marajoara Galvao & Damasceno (Brazil, Venezuela, Colombia, and
southern Central America), a known vector of malaria (Conn et al. 2002); Anopheles deaneorum Rosa-Freitas (northern
Argentina to western Brazil), a suspected malaria vector (Klein, Lima, and Tada 1991; Klein, Lima, Tada, and Miller
1991); and 2 other species, A. albitarsis Lynch-Arribalzaga (southern Brazil, northern Argentina, and Paraguay) and
A. albitarsis B (south, central, and eastern Brazil), whose role in malaria transmission is unclear. The 4 species can be
reliably separated by random amplified polymorphic DNA (RAPD) (Wilkerson, Gaffigan, and Lima 1995; Wilkerson,
Parsons, et al. 1995) and the white gene (Merritt et al. 2005). We initially sought to examine the phylogenetic
relationships among these species employing a number of genes, including ITS2 and cytochrome oxidase I (COI)
(Wilkerson et al. 2005). Direct sequencing of ITS2 gave ambiguous results; thus, we cloned and sequenced ITS2
from these species to quantify the magnitude and prevalence of intragenomic ITS2 variation and determine its effect on
phylogenetic reconstruction. In addition, we sampled both male and female individuals within each species to investigate
possible differences in rDNA between the sexes.

Materials  and  Methods 

Taxon  Sampling 

Morphological identification of A. albitarsis s.l. was carried out using characters found in Linthicum (1988) and Peyton
et al. (1992). Specimens used for cloning and sequencing are given in Table 1. They represent examples from progeny
broods reported in Wilkerson, Gaffigan, and Lima (1995) and Wilkerson, Parsons, et al. (1995) from widely separated
parts of the species ranges, including type localities of the 3 named species, and both sexes. In addition, a larger sample
of A. marajoara is represented because of its wide distribution and the possibility of a cryptic species, A. albitarsis E (Lehr
et al. 2005). For brevity, letter designations are sometimes used that follow those in Wilkerson, Gaffigan, and Lima
(1995) and Wilkerson, Parsons, et al. (1995): A = A. albitarsis, B = A. albitarsis B, C = A. marajoara, D = A. deaneorum, for
example, in Tables 1-3. The ITS2 sequence reported here was used to design diagnostic primers (Li and Wilkerson
2005) that correctly identified all specimens first recognized with RAPD markers (Wilkerson, Gaffigan, and Lima 1995;
Wilkerson, Parsons, et al. 1995) as follows: A. albitarsis, n = 56; A. marajoara, n = 407; A. deaneorum, n = 41; and
A. albitarsis B, n = 56. Because there was complete concordance of data sets for a relatively large sample from many
locations, we were able to base our conclusions on a much smaller number of cloned individuals.

DNA  Processing 

DNA was isolated from individual adult mosquitoes by phenol-chloroform extraction as described in Wilkerson
et al. (1993). The ITS2 region was amplified using PCR primers based on conserved sequences in the 5.8S and 28S
ribosomal subunits of A. quadrimaculatus Say (Cornel et al. 1996). The boundaries of the ITS2 were determined as in
Cornel et al. (1996, Figure 1A). PCRs were carried out as described in Li and Wilkerson (2005). Amplified PCR products
were cleaned using QIAquick PCR purification kit (Promega, Madison, WI). About 200 ng of each purified PCR
product was ligated into pCR-TOPO plasmid (Invitrogen, Carlsbad, CA). Two microliters of the ligation reaction mixture
was then transformed into competent One Shot cells (TOPO TA Cloning Kit, Invitrogen). Transformed cultures
were plated on Luria-Bertani plates containing 5-bromo-4-chloro-3-indolyl-beta-D-galactopyranoside,
isopropyl-beta-D-thiogalactopyranoside, and 50 µg/ml ampicillin. Successful insertions were confirmed by PCR. Plasmids
were extracted by the mini-prep method (Sambrook et al. 1989). Sequencing and alignment were as described in Li et al.
(2005). Sequence statistics were obtained using PAUP version 4.0b4 (Swofford 1998). GenBank accession numbers are
given in Table 2.

Genetic  Distance  and  Phylogenetic  Analysis 

Uncorrected “p” pairwise distances were calculated by PAUP version 4.0b10 (Swofford 1998). The aligned ITS2 sequences
were analyzed by maximum parsimony (MP) as implemented in PAUP and by Bayesian analysis carried out using MRBAYES
3.1 (Huelsenbeck and Ronquist 2001). The parsimony and Bayesian analyses were chosen because gap information
can be incorporated into both. Each gap was treated as a single character regardless of the length of the gap, under the
assumption that a given gap results from one mutational event (Simmons and Ochoterena 2000). Single unique mutations
were disregarded because of the possibility that they were the result of Taq replication error. Parsimony analysis
was conducted using the heuristic search option with the TBR (tree-bisection-reconnection) branch-swapping algorithm.
Parsimony bootstrapping was done with 1000 pseudoreplicates with 10 random taxon addition replicates per
pseudoreplicate. For Bayesian analysis, we used MRMODELTEST 2.2 (Nylander 2004) to choose an input evolutionary model.
Markov chain Monte Carlo runs were 2 × 10⁷ generations long with sampling every 5 × 10³ generations, for a total
of 4001 samples. Of these, the first 1001 were discarded as burn-in, which is well past the point where the likelihood
plot reached a plateau.
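
The sample counts quoted for the Bayesian runs can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name is hypothetical, and the chain length (2 × 10⁷ generations) and sampling interval (5 × 10³ generations) are assumed values chosen because they reproduce the 4001 samples stated in the text. It also assumes that the starting state (generation 0) is counted as a sample.

```python
def mcmc_sample_counts(n_generations: int, sample_every: int, n_burnin: int) -> tuple[int, int]:
    """Return (total samples, samples kept after burn-in) for one MCMC run.

    Assumes the chain records generation 0 and then every `sample_every`
    generations, so the total is n_generations / sample_every + 1.
    """
    total = n_generations // sample_every + 1
    kept = total - n_burnin
    return total, kept


# Assumed run settings (see the lead-in); 1001 burn-in samples as stated in the text.
total, kept = mcmc_sample_counts(20_000_000, 5_000, 1001)
print(f"total samples: {total}, retained after burn-in: {kept}")
```

Under these assumptions, the posterior summaries would rest on the samples that remain after the first 1001 are discarded.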

RNA  Secondary  Structure 

The putative secondary structure of the ITS2 was estimated using MFOLD (Zuker et al. 1999). A t-distribution was
calculated to compare the minimum free energy levels of all clones given by MFOLD (Sokal and Rohlf 1981).

Table 1. Collection localities, number of clones, and GenBank accession numbers for specimens used in cloning of rDNA ITS2 of
species belonging to the Anopheles (Nyssorhynchus) albitarsis complex

Species (M = male, F = female)  Code          Country    State         Locality                       Coordinates
albitarsis (A1)                 BR504(8)      Brazil     Parana        Near Guaira                    24°04'S, 54°15'W
albitarsis (AM2, AF2)           AR7(8)        Argentina  Buenos Aires  Baradero (type locality)       33°48'S, 59°30'W
albitarsis B (B1)               BR019(12)     Brazil     Ceara         Paraipaba                      3°25'S, 39°13'W
albitarsis B (BM2, BF2)         BR/SP 500(1)  Brazil     Sao Paulo     Near Registro                  24°37'S, 47°53'W
marajoara (C1)                  BR026(12)     Brazil     Amazonas      Manaus                         2°53'S, 60°15'W
marajoara (CF2, CM2)            BR/R001(10)   Brazil     Para          Marajo Island (type locality)  1°00'S, 49°30'W
marajoara (C3)                  COJ9          Venezuela  Cojedes       Finca “Rosa Blanca”            Not known
marajoara (C4)                  COJ10         Venezuela  Cojedes       Finca “Rosa Blanca”            Not known
marajoara (C5)                  BR4           Brazil     Roraima       Boa Vista                      2°45'28"N, 60°42'18"W
marajoara (C6)                  PIS9          Brazil     Amapa         North of Amapa                 Not known
marajoara (C7)                  ITB13763      Brazil     Para          Near Itaituba                  Not known
deaneorum (D1)                  BR/R007(15)   Brazil     Rondonia      Guajara Mirim                  10°50'S, 65°20'W
deaneorum (DF2, DM2)            AR3(4)        Argentina  Corrientes    Corrientes                     27°28'S, 59°50'W
deaneorum (DF3)                 AR2(3)        Argentina  Corrientes    90 km West of Posadas          Not known

Letter designations, A1, A2, etc., correspond to Table 3.
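
One way to compare the minimum free energy of a single clone with those of a set of other clones is the t-test for a single observation against a sample mean. The sketch below is an assumption about the form of the comparison, not a record of the exact procedure used here; the function name and the energy values are hypothetical and serve only to show the calculation.

```python
import math
import statistics


def single_vs_sample_t(observation: float, sample: list[float]) -> tuple[float, int]:
    """t statistic and degrees of freedom for one observation versus a sample.

    t = (observation - mean) / (s * sqrt((n + 1) / n)), with n - 1 degrees of
    freedom, where s is the sample standard deviation.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two values in the reference sample")
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # uses the n - 1 denominator
    t = (observation - mean) / (sd * math.sqrt((n + 1) / n))
    return t, n - 1


# Hypothetical minimum free energies (kcal/mole) for illustration only.
reference_clones = [-175.6, -176.8, -177.9, -178.2, -179.0]
variant_clone = -167.4
t_value, dof = single_vs_sample_t(variant_clone, reference_clones)
print(f"t = {t_value:.2f} with {dof} degrees of freedom")
```

The resulting t value would then be compared with the critical value of the t-distribution for the stated degrees of freedom.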

Results 

We cloned ITS2 PCR products from each sex of A. albitarsis (n = 3), A. albitarsis B (n = 3), A. deaneorum (n = 4), and
A. marajoara (n = 8). Individuals from widely separated localities, including type localities, were used as described above
and in Table 1. The larger sample of A. marajoara served to test for consistency of sequence in this widely distributed
species and to test the hypothesis of the fifth species (Lehr et al. 2005). The number of clones from the 18 total individuals
ranged from 3 to 28 (Table 2), giving a total of 217 clones. Alignment of sequences was straightforward because there
was little sequence variation. Unless otherwise stated, the following description does not apply to 2 variant A. marajoara
clones, C3.1 and C4.1, from individuals COJ9 and COJ10 from Cojedes, Venezuela (Tables 1 and 2), which we discuss
separately.

Inter-  and  Intragenomic  Variation 

Total length of the ITS2 ranged from 344 to 365 bp. There were 4 microsatellite regions, (GT)5–7 at position 118,
(GA)3–11 at 273, (CT)4 at 147, and (GC)3 at 345, all of which were common to all 4 species (Table 2). The first 2 regions
were variable and contributed to all the length and intragenomic variations of ITS2 within A. albitarsis B and A. deaneorum.
However, repeat number was not species specific. There were 3 interspecific and/or intraspecific 2- or 3-base indels
at positions 34, 271, and 236 and 8 single-base substitutions at positions 30, 43, 80, 248, 260, 268, 276, and 328.
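
Repeat number at a microsatellite can be read from a clone sequence by finding the longest tandem run of the motif. The following sketch shows one way to do this with a regular expression; the function name and the example sequences are hypothetical and are not taken from the clones reported here.

```python
import re


def longest_repeat_run(sequence: str, motif: str) -> tuple[int, int]:
    """Return (0-based start, repeat count) of the longest tandem run of motif.

    Returns (-1, 0) when the motif does not occur in the sequence.
    """
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    best_start, best_count = -1, 0
    for match in pattern.finditer(sequence):
        count = len(match.group(0)) // len(motif)
        if count > best_count:
            best_start, best_count = match.start(), count
    return best_start, best_count


# Hypothetical clone fragments differing only in (GT) repeat number.
clone_a = "AAC" + "GT" * 5 + "CCA"
clone_b = "AAC" + "GT" * 7 + "CCA"
for name, seq in (("clone_a", clone_a), ("clone_b", clone_b)):
    start, count = longest_repeat_run(seq, "GT")
    print(f"{name}: (GT){count} starting at index {start}")
```

Applied to every clone of an individual, a count like this would show directly whether repeat number varies within a genome, as it does for the first 2 regions described above.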

The polymorphic ACC and GC indels in A. albitarsis occurred concordantly in clones from 2 individuals in about
equal proportions, 5 of 15 in specimen A1 and 5 of 20 in specimen AM2. The third individual of A. albitarsis (AF2) also
had a low proportion of ACC indel clones (1 of 21), but the GC indel was not present in our sample. There was no
indication of any obvious correlation of the other 2 polymorphic sites (positions 43 and 328) in this species with each
other or the ACC and GC indels.

Phylogenetic  Analysis 

MP and Bayesian analyses were carried out for clones from 3 individuals of each species with the microsatellite regions
removed and indels coded as 0 or 1 (Simmons and Ochoterena 2000). For clarity, because the combined and separate results
were nearly identical, we present results from a single individual (Figure 1) of each species. Tree topology was the same
for both analyses, but branch support was better with Bayesian analysis (support for both shown in Figure 1). Anopheles
deaneorum, A. marajoara, and A. albitarsis B all clustered into separate groups. However, A. albitarsis clones separated into
2 groups corresponding to the correlated and partially correlated ACC and GC indels described above. Variation among
clones was slight, with intragenomic base differences ranging from 0.0% to 0.57%, intraspecific variation ranging from
0.0% to 0.60%, and interspecific variation ranging from 0.28% to 1.17% (Table 3).
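
Coding indels as 0 or 1, with each gap counted once regardless of its length, can be sketched as follows. This is a simplified illustration rather than the full published coding scheme: every distinct gap interval in the alignment becomes one binary character, scored 1 for sequences that carry exactly that gap and 0 otherwise. The function names and the toy alignment are hypothetical.

```python
def gap_intervals(aligned: str) -> set[tuple[int, int]]:
    """Return the set of (start, end) half-open intervals of '-' runs."""
    intervals = set()
    start = None
    for i, ch in enumerate(aligned):
        if ch == "-":
            if start is None:
                start = i
        elif start is not None:
            intervals.add((start, i))
            start = None
    if start is not None:
        intervals.add((start, len(aligned)))
    return intervals


def code_indels(alignment: dict[str, str]) -> tuple[list[tuple[int, int]], dict[str, list[int]]]:
    """Code each distinct gap interval as one presence/absence character."""
    per_seq = {name: gap_intervals(seq) for name, seq in alignment.items()}
    characters = sorted(set().union(*per_seq.values()))
    matrix = {
        name: [1 if interval in gaps else 0 for interval in characters]
        for name, gaps in per_seq.items()
    }
    return characters, matrix


# Toy alignment: a 3-base gap and a 2-base gap each count as a single character.
toy = {"clone1": "ACGACCTT", "clone2": "ACG---TT", "clone3": "AC--CCTT"}
characters, matrix = code_indels(toy)
print(characters)
print(matrix)
```

Because a 3-base gap contributes one character rather than three, a single insertion or deletion event is not overweighted relative to a base substitution.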

Additional A. marajoara Clones

Twenty-six clones from 3 individuals representing A. albitarsis E of Lehr et al. (2005) were sequenced, 1 from Boa Vista in
northern Brazil and 2 from Venezuela. Except for rare mutations, sequences of these clones matched sequences
from other collection sites, including the type locality of A. marajoara, Marajo Island, Brazil (Table 2).

Variant A. marajoara Clones

Significant divergence was seen in a single clone from each of 2 individuals, COJ9 (clone C3.1) and COJ10 (clone C4.1), from
Cojedes, Venezuela. These sequences were similar to each other but quite different from all other clones (Table 2).
Clone C3.1 differed from other conspecific clones by 5.6–6.6% and C4.1 differed by 3.5–4.1%. Genetic difference
between the 2 variant clones was 2.0%.

Table 2. The rDNA ITS2 sequences that differ among Anopheles albitarsis complex species

Columns: Specimen No.; Ratio; variable ITS2 positions between 30 and 344; Initial dG (kcal/mole)a; Allele name; GenBank Accession No.

Species are as follows: A = A. albitarsis, B = A. albitarsis B, C = Anopheles marajoara, D = Anopheles deaneorum. Columns 1 and 2 are
specimen number, clone number, and number of clones/total; for example, A1.1 is A. albitarsis specimen 1, set of like clones
number 1, which was found in 3 of 15 total clones. M and F denote male and female, if known. Only variable bases are shown.
Single mutations that appeared only once were not considered because of the possibility they were due to Taq errors.

a The initial dG is the energy level of folded RNA. The region for the test includes 91 bases in the 5.8S subunit and 43 bases in the 28S.

Table 3. Uncorrected “p” distance matrix of clones from A1, B1, C1, and D1

      A1.1    A1.2    A1.3    A1.4    B1.1    B1.2    B1.3    B1.4    C1.1    C1.2    D1.1    D1.2    D1.3
A1.2  0.0028  -
A1.3  0.0057  0.0028  -
A1.4  0.0057  0.0028  0       -
B1.1  0.0085  0.0113  0.0086  0.0086  -
B1.2  0.0086  0.0114  0.0087  0.0086  0       -
B1.3  0.0057  0.0085  0.0057  0.0057  0.0028  0.0029  -
B1.4  0.0086  0.0114  0.0087  0.0086  0       0       0.0029  -
C1.1  0.0057  0.0086  0.0115  0.0115  0.0143  0.0144  0.0115  0.0144  -
C1.2  0.0028  0.0057  0.0086  0.0085  0.0114  0.0114  0.0086  0.0115  0.0029  -
D1.1  0.0028  0.0057  0.0087  0.0087  0.0114  0.0115  0.0086  0.0114  0.0085  0.0057  -
D1.2  0.0028  0.0057  0.0088  0.0087  0.0116  0.0117  0.0087  0.0116  0.0087  0.0058  0       -
D1.3  0.0028  0.0057  0.0088  0.0088  0.0117  0.0118  0.0088  0.0116  0.0087  0.0058  0       0       -
D1.4  0.0028  0.0056  0.0086  0.0086  0.0113  0.0114  0.0084  0.0114  0.0086  0.0057  0       0       0
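
An uncorrected “p” distance is simply the proportion of compared sites at which 2 aligned sequences differ. The sketch below is a minimal illustration, not the PAUP implementation: the function name is hypothetical, and skipping any site where either sequence has a gap is an assumption about how gaps are handled.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Hypothetical 350-site sequences differing at a single site.
reference = "A" * 350
one_change = "A" * 349 + "C"
print(f"{p_distance(reference, one_change):.4f}")
```

For a spacer of roughly 350 bp, one base difference corresponds to a distance of about 0.0028 to 0.0029, which is the size of the smallest nonzero entries in Table 3.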

Secondary  Structure  of  rRNA 

The secondary structures of rDNA ITS2 were predicted by MFOLD (Zuker et al. 1999). Minimum free energies in
kilocalories/mole were -181.5 to -185.5 for A. albitarsis, -176.7 to -180.0 for A. albitarsis B, -175.6 to -179.0
for A. marajoara, and -178.6 to -183.4 for A. deaneorum. The structures of the 2 variant A. marajoara clones, C3.1
and C4.1, have significantly less negative minimum free energies (-167.4 and -168.9 kcal/mole; P < 0.01 in the
Student’s t-test) and presumably lower stability than other A. marajoara clones. Figure 2A shows the predicted folding
structures of all clones in the albitarsis complex except C3.1 (Figure 2B) and C4.1 (Figure 2C). Note that the stem and
loop near the presumptive ITS2 excision site (Fritz et al. 1994) (next to the right arrow in Figure 2A) is missing in the
2 variant clones (Figure 2B,C).

Figure 1. MP tree generated from rDNA ITS2 sequence derived from individuals A1, B1, C1, and D1 in Table 3.
Species (number of clones): Anopheles albitarsis (15), A. albitarsis B (10), Anopheles marajoara (15), and Anopheles
deaneorum (11). The same topology is found when AM2, AF2, BM2, BF2, CM2, CF2, DM2, and DF2 are combined. Bootstrap
values are on the branches. The first numbers are from MP analysis; the second number in parenthesis is from MRBAYES analysis.

Discussion 

A basic assumption about multigene families, such as rDNA, is that the processes collectively referred to as concerted
evolution (gene conversion and unequal crossing over) maintain homogeneity of all copies (Hood et al. 1975; Smith 1976;
Zimmer et al. 1980; Dover 1982). Mutations rapidly spread to all members of the gene family even if there are arrays
located on different chromosomes (Dover 1982; Arnheim 1983; Gerbi 1985; Tautz et al. 1988). In the case of noncoding
regions, such as ITS2, this can lead to fixed interspecific differences and intraspecific homogeneity. The efficiency of
homogenization of rDNA is usually high (Liao 1999), as exemplified by the common use of rDNA markers for mosquito
identification, most of which are derived from ITS2 (examples given in Wilkerson et al. 2004). However, as our results
show, when mutation rates are higher than rates of homogenization, variation within individuals may be greater
than that observed between populations (see also Fritz et al. 1994; Onyabe and Conn 1999; Wilkerson et al. 2004).
This possibility should be accounted for before rDNA is used for phylogenetic or population studies or as a basis for
species-specific PCR primers.

ITS2  Variation 

The ITS2 of all 4 RAPD-determined species in the Albitarsis Complex was intragenomically and interspecifically variable.
Length variation was limited (344–365 bp) and mostly attributable to the 2 variable microsatellite regions. In addition,
there were a number of indels and base substitutions accounting for both the length and sequence variabilities (see Results
and Table 2). Anopheles albitarsis differed from the other species in having intragenomically variable ACC and GC indels
(positions 236 and 271) and a variable T/C mutation at position 43. The ACC indel and the T/C mutation were
used by Li and Wilkerson (2005) to design species-specific primers to identify A. albitarsis, A. albitarsis B, and A. deaneorum
as a group, and A. albitarsis, respectively. Even though the above 3 (ACC, GC, and T/C) differences are not fixed in
A. albitarsis, PCR primers designed based on them still amplified as if only the target sequence were present and
therefore still functioned to diagnose the species or groups of species.

Figure 2. Predicted secondary structure of rDNA ITS2, including a combined 134 bases from the flanking 5.8S (91) and 28S (43)
subunits. The secondary structure common to clones from all species (A) except for the 2 clones shown in (B) and (C), which
were found in 2 individuals of Anopheles marajoara from Cojedes, Venezuela.

Clones of A. albitarsis ITS2 showed greater diversity than those of the other 3 species. In this case, intragenomic ITS2
variation within A. albitarsis was greater than that between species in the complex. For example, the genetic distance
between A1.1 and A1.3 (A. albitarsis) was 0.57%, whereas the difference between A1.1 and D1.1 (A. deaneorum) was 0.28%
(Table 3). This is an apparent example of mutation rates that are higher than homogenization rates. Intragenomic
variation at ITS2, and in other parts of the rDNA gene array, is probably very common (Harris and Crandall 2000).
Intragenomic variation has also been found in a number of other Anopheles species (Onyabe and Conn 1999;
Wilkerson et al. 2004; Fairley et al. 2005) and in other mosquitoes in subfamily Culicinae (Black et al. 1989; Wesson
et al. 1992; Miller et al. 1996; Beebe et al. 2000).

Effect  of  Microsatellites  on  Phylogenetic  Reconstruction 

Highly variable microsatellites may have confounding effects on phylogenetic and population genetics analyses. Harris and
Crandall (2000) noted that if the multicopy nature of a marker is not recognized, inconsistent results can occur because
alleles will not be distributed in a Mendelian manner. Cloning results verified our hypothesis that microsatellite variation
was responsible for ambiguous sequencing results. However, in our case (data not shown), and in that of Vogler and
DeSalle (1994), phylogenetic results were not affected by exclusion of microsatellite regions.

Chromosome  Location  of  rDNA  Arrays 

Figure 1 shows 2 clusters of clones from the same individual of A. albitarsis that are as different from each other as they
are from the other 3 species. This suggests either that there are 2 rDNA loci within A. albitarsis or that there are
semi-independently evolving homologous rDNA loci. This could be caused by inefficient gene conversion and gene
recombination. Multiple rDNA locations are not unusual; for example, there are 5 in humans (Gonzalez and Sylvester 2001) and
at least 2 in Drosophila hydei (Hennig et al. 1975) and grasshoppers (White et al. 1982). Similar explanations were
considered for other Anopheles mosquitoes by Onyabe and Conn (1999) and Beebe et al. (2000).

The rDNA arrays are usually on chromosomes associated with sex determination. Kumar and Rai (1990) and Marchi
and Pili (1994) mapped dozens of species of mosquitoes and found rDNA loci on the autosomes of culicine mosquitoes
and on the X and Y chromosomes of an Anopheles. In addition, they found loci on heterologous chromosomes in
genus Aedes, the only confirmed example of loci on different chromosomes found so far in mosquitoes. In the Gambiae
Complex (subgenus Cellia), Anopheles gambiae Giles and Anopheles arabiensis Patton have rDNA only on the X
chromosome, whereas in the other species of the complex, it is on the X and Y chromosomes (Collins et al. 1989). In
2 Anopheles subgenus Nyssorhynchus species, Rafael et al. (2003) found rDNA on both the X and Y chromosomes.
If rDNA were associated only with the X chromosome, as it is in A. gambiae, then males would be expected to have half
the number of rDNA cistron copies (Collins et al. 1989) and half the haplotype diversity. If there were a subset of rDNA
associated with the Y but not the X chromosome, then only males would be expected to have the Y-associated rDNA. In
our sample, we did not see higher haplotype diversity associated with males or females.

Polanco et al. (2000) proposed 2 models to account for apparently correlated sets of rDNA other than loci on
separate chromosomes: a haplotypic single-lineage model for ITS evolution and a multilineage model for IGS evolution.
The X and Y chromosomes in Anopheles are only partially homologous, and X chromosome variants do occur (Baimai
et al. 1993; Rafael et al. 2003). Such factors may contribute to incomplete homogenization and could explain our finding
of partially correlated intragenomic ITS2 haplotypes. As employed in the above studies, physical mapping using in situ
hybridization is needed to confirm the location of rDNA loci in the A. albitarsis complex species.

Anopheles  albitarsis  Species  E 

Based on complete sequence of the mitochondrial COI, Lehr et al. (2005) proposed a fifth species (A. albitarsis E) for the
albitarsis complex in northern Brazil and Venezuela. We found no evidence from ITS2 sequence to support their
conclusions. Isosequential ITS2 can occur in closely related Anopheles species (see above), and additional data are
necessary to resolve this question.

Variant  A.  marajoara  Clones 

Anopheles marajoara individuals COJ9 and COJ10 from Cojedes, Venezuela, each had a different highly divergent
clone (Table 2). The sequences are similar to the other A. marajoara ITS2 but differ from each other by about as
much as A. marajoara does from the other 3 species. One of the clones (C3.1) has many mutations throughout its
length, whereas the other (C4.1) is the same as all the other A. marajoara clones up to position 207, after which it
mirrors the mutations in the more divergent clone. This “half-variant” could be due to template jumping, which
could anomalously combine normal and variant sequence (Thompson et al. 2002). The relatively high sequence
variation between these 2 clones suggests that these copies could be from nonfunctioning rDNA (pseudogenes). To test this
possibility, we compared estimated minimum free energy levels and looked at the secondary structure predicted by the
program MFOLD (Zuker et al. 1999). We found that the predicted structures of these 2 clones have statistically
significantly less negative minimum free energies than all other clones (see above) and therefore lower structural
stability. In addition, the variant clones lack a stem and loop at the ITS2 excision site present in all other clones
(Figure 2). It is possible that this structural variation could affect cleavage efficiency of the precursor
RNA, and it leads us to conclude that these copies probably come from nonfunctioning rDNA (pseudogenes). To our
knowledge, this is the first such report in a mosquito, but pseudogenes have been documented in other organisms
(Brownell et al. 1983; Benevolenskaya et al. 1997; Razafimandimbison et al. 2004). Further work is clearly needed to
verify this observation.

Application  of  ITS2  Intragenomic  Variation 

Unambiguous identification of Anopheles malaria vector species is essential for the study of an array of factors that affect
control and disease transmission. When morphological characters are not available, molecular alternatives must be found.
In the case of the Albitarsis Complex, we initially looked at sequence of the rDNA ITS2 hoping to find a way to separate
the 4 species. Ordinarily, it is possible to directly sequence the ITS2 without ambiguity, but in the Albitarsis Complex, direct
sequence results were not clear because of intragenomic variation. Using ITS2 clones, we were able to identify primer
locations that were not compromised by intraspecific and intragenomic variability (Li and Wilkerson 2005). Such
variability often cannot be seen in direct sequencing and could lead to design of primers that will give erroneous or
ambiguous results. For example, at position 236 (Table 2) in A. albitarsis, there are 2 alleles, ACC present and ACC absent.
In a consensus sequence, ACC absent copies are preferentially amplified because they are more common. If a primer
were designed based on ACC present, then an A. albitarsis sample would be misidentified as A. albitarsis B or A.
deaneorum. Similar results could occur with primers designed based on positions 43 and 328. With these data, we were able
to design primers for the 4 species previously determined using RAPDs and provide an identification tool for an
important malaria vector group.
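
The screening of candidate primer sites against intragenomic variation can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure of Li and Wilkerson (2005): the function names and the toy clone sequences are hypothetical, positions are 1-based alignment columns, and a site is called clean only if no column within it varies among the clones of the target species.

```python
def variable_positions(aligned_clones: list[str]) -> list[int]:
    """Return 1-based alignment columns where the clones do not all agree."""
    length = len(aligned_clones[0])
    if any(len(clone) != length for clone in aligned_clones):
        raise ValueError("clones must be aligned to the same length")
    return [
        i + 1
        for i in range(length)
        if len({clone[i] for clone in aligned_clones}) > 1
    ]


def primer_site_is_clean(aligned_clones: list[str], start: int, end: int) -> bool:
    """True if no variable column falls within the 1-based range start..end."""
    return not any(start <= pos <= end for pos in variable_positions(aligned_clones))


# Toy clones from one individual: an ACC indel is present in some copies only.
clones = ["ACGTACCGT", "ACGT---GT", "ACGTACCGT"]
print(variable_positions(clones))
print(primer_site_is_clean(clones, 1, 4))
print(primer_site_is_clean(clones, 3, 6))
```

A primer placed over columns that vary within a genome may amplify only a subset of the rDNA copies, which is the kind of erroneous or ambiguous result described above.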